ai agent
Emergent Risk Awareness in Rational Agents under Resource Constraints
Advanced reasoning models with agentic capabilities (AI agents) are deployed to interact with humans and to solve sequential decision-making problems under (approximate) utility functions and internal models. When such problems have resource or failure constraints where action sequences may be forcibly terminated once resources are exhausted, agents face implicit trade-offs that reshape their utility-driven (rational) behaviour. Additionally, since these agents are typically commissioned by a human principal to act on their behalf, asymmetries in constraint exposure can give rise to previously unanticipated misalignment between human objectives and agent incentives. We formalise this setting through a survival bandit framework, provide theoretical and empirical results that quantify the impact of survival-driven preference shifts, identify conditions under which misalignment emerges and propose mechanisms to mitigate the emergence of risk-seeking or risk-averse behaviours. As a result, this work aims to increase understanding and interpretability of emergent behaviours of AI agents operating under such survival pressure, and offer guidelines for safely deploying such AI systems in critical resource-limited environments.
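To make the survival pressure described above concrete, here is a minimal sketch. It is an illustration only, not the paper's formalisation. It assumes that rewards are added to a budget, that the episode is terminated as soon as the budget reaches zero or below, and that the agent commits to a single arm. The function name, the arm payoffs, and the budget and horizon values are all hypothetical.

```python
from functools import lru_cache

# Hypothetical arms, each a tuple of (probability, reward) outcomes.
SAFE_ARM = ((1.0, 1),)              # always +1, mean 1.0
RISKY_ARM = ((0.5, 5), (0.5, -2))   # mean 1.5, but can drain the budget


def expected_total_reward(budget, horizon, arm):
    """Exact expected cumulative reward from always pulling `arm`.

    Assumption (not from the source): each reward is added to the budget,
    and the episode is forcibly terminated once the budget is <= 0.
    """

    @lru_cache(maxsize=None)
    def value(b, t):
        # No further reward once the horizon ends or resources run out.
        if t == 0 or b <= 0:
            return 0.0
        return sum(p * (r + value(b + r, t - 1)) for p, r in arm)

    return value(budget, horizon)


if __name__ == "__main__":
    print(expected_total_reward(2, 4, SAFE_ARM))     # 4.0
    print(expected_total_reward(2, 4, RISKY_ARM))    # 3.75
    print(expected_total_reward(100, 4, RISKY_ARM))  # 6.0
```

Under these assumed numbers, the risky arm has the higher expected return when the budget is ample (6.0 against 4.0), yet it falls below the safe arm once the budget is tight (3.75 against 4.0). A principal who is not exposed to the termination constraint would still prefer the risky arm, which is one way an asymmetry in constraint exposure could show up.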
AI agents don't need more hype. They need a map
A new standard called Agentic Resource Discovery (ARD), backed by Google, Microsoft, and Nvidia, aims to create a "Google Search for AI" to help agents find online services. This standardization could transform the chaotic "agentic web" into an efficient system where AI agents perform complex tasks reliably and effectively. AI is coming for us, they keep saying. We'll all have AI agents acting on our behalf, doing everything from our weekly grocery shopping to booking airline tickets. It's gonna change everything, just like the web did!
MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents
Spatial planning is a crucial part of spatial intelligence, requiring the understanding and planning of object arrangements from a spatial perspective. AI agents with spatial planning ability can better adapt to various real-world applications, including robotic manipulation, automatic assembly, urban planning, etc. Recent works have attempted to construct benchmarks for evaluating the spatial intelligence of Multimodal Large Language Models (MLLMs). Nevertheless, these benchmarks primarily focus on spatial reasoning based on typical Visual Question-Answering (VQA) forms, which suffer from a gap between abstract spatial understanding and concrete task execution. In this work, we take a step further and build a comprehensive benchmark called MineAnyBuild, which aims to evaluate the spatial planning ability of open-world AI agents in the Minecraft game. Specifically, MineAnyBuild requires an agent to generate executable architecture building plans based on the given multi-modal human instructions.
Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition
Recent advances have enabled LLM-powered AI agents to autonomously execute complex tasks by combining language model reasoning with tools, memory, and web access. But can these systems be trusted to follow deployment policies in realistic environments, especially under attack? To investigate, we ran the largest public red-teaming competition to date, targeting 22 frontier AI agents across 44 realistic deployment scenarios. Participants submitted 1.8 million prompt-injection attacks, with over 60,000 successfully eliciting policy violations such as unauthorized data access, illicit financial actions, and regulatory noncompliance. We use these results to build the Agent Red Teaming (ART) benchmark--a curated set of high-impact attacks--and evaluate it across 19 state-of-the-art models.
AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents
Autonomous AI agents that can follow instructions and perform complex multi-step tasks have tremendous potential to boost human productivity. However, to perform many of these tasks, the agents need access to personal information from their users, raising the question of whether they are capable of using it appropriately.
MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research
Recent advancements in AI agents have demonstrated their growing potential to drive and support scientific discovery. In this work, we introduce MLR-Bench, a comprehensive benchmark for evaluating AI agents on open-ended machine learning research. MLR-Bench includes three key components: (1) 201 research tasks sourced from NeurIPS, ICLR, and ICML workshops covering diverse ML topics; (2) MLR-Judge, an automated evaluation framework combining LLM-based reviewers with carefully designed review rubrics to assess research quality; and (3) MLR-Agent, a modular agent scaffold capable of completing research tasks through four stages: idea generation, proposal formulation, experimentation, and paper writing. Our framework supports both stepwise assessment across these distinct research stages and end-to-end evaluation of the final research paper. We then use MLR-Bench to evaluate six frontier LLMs and an advanced coding agent, finding that while LLMs are effective at generating coherent ideas and well-structured papers, current coding agents frequently (e.g., in 80% of the cases) produce fabricated or invalidated experimental results--posing a major barrier to scientific reliability.
Here's How AI Agents Can Protect EV Chargers
An AI agent system proposed by researchers in Spain promises to prevent energy theft and damage to EV chargers, as well as to the critical energy infrastructure that powers them. The number of electric vehicles on roads around the world continues to grow. The boom in EV adoption has driven the development of accessible, fast, and efficient charging infrastructure. However, this expansion also brings with it new cybersecurity risks that have not been widely studied, and for which there are still few viable solutions. Cristina Alcaraz, an infrastructure-security researcher at Spain's University of Malaga, explains that electric-vehicle charging stations are vulnerable because they integrate multiple physical and digital components.
Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition
AI agents are rapidly being deployed across diverse industries, but can they adhere to deployment policies under attack? We organized a one-month red teaming challenge, the largest of its kind to date, involving expert red teamers attempting to elicit policy violations from AI agents powered by 22 frontier LLMs. Our challenge collected 1.8 million prompt injection attacks, resulting in over 60,000 documented successful policy violations, revealing critical vulnerabilities. Utilizing this extensive data, we construct a challenging AI agent red teaming benchmark, currently achieving near 100% attack success rates across all tested agents and associated policies. Our further analysis reveals high transferability and universality of successful attacks, underscoring the scale and criticality of existing AI agent vulnerabilities. We also observe minimal correlation between agent robustness and factors such as model capability, size, or inference compute budget, highlighting the necessity of substantial improvements in defense. We hope our benchmark and insights drive further research toward more secure and reliable AI agents.
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
Training effective AI agents for multi-turn interactions requires high-quality data that captures realistic human-agent dynamics, yet such data is scarce and expensive to collect manually. We introduce APIGen-MT, a two-phase framework that generates verifiable and diverse multi-turn agent data. In the first phase, our agentic pipeline produces detailed task blueprints with ground-truth actions, leveraging a committee of LLM reviewers and iterative feedback loops. These blueprints are then transformed into complete interaction trajectories through simulated human-agent interplay. We train a family of models, the xLAM-2-fc-r series, with sizes ranging from 1B to 70B parameters. Our models outperform frontier models such as GPT-4o and Claude 3.5 on τ-bench and BFCL benchmarks, with the smaller models surpassing their larger counterparts, particularly in multi-turn settings, while maintaining superior consistency across multiple trials. Comprehensive experiments demonstrate that our verified blueprint-to-details approach yields high-quality training data, enabling the development of more reliable, efficient, and capable agents.